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RFC 9285 
The Base45 Data Encoding 


Abstract 


This document describes the Base45 encoding scheme, which is built upon the Base64, Base32, and 
Base16 encoding schemes. 


Status of This Memo 


This document is not an Internet Standards Track specification; it is published for informational 
purposes. 


This document is a product of the Internet Engineering Task Force (IETF). It represents the 
consensus of the IETF community. It has received public review and has been approved for 
publication by the Internet Engineering Steering Group (IESG). Not all documents approved by 
the IESG are candidates for any level of Internet Standard; see Section 2 of RFC 7841. 


Information about the current status of this document, any errata, and how to provide feedback 
on it may be obtained at https://www.rfc-editor.org/info/rfc9285. 


Copyright Notice 


Copyright (c) 2022 IETF Trust and the persons identified as the document authors. All rights 
reserved. 


This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF 
Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this 
document. Please review these documents carefully, as they describe your rights and restrictions 
with respect to this document. Code Components extracted from this document must include 
Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are 
provided without warranty as described in the Revised BSD License. 
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1. Introduction 


August 2022 


A QR code is used to encode text as a graphical image. Depending on the characters used in the 
text, various encoding options for a QR code exist, e.g., Numeric, Alphanumeric, and Byte mode. 
Even in Byte mode, a typical QR code reader tries to interpret a byte sequence as text encoded in 
UTF-8 or ISO/IEC 8859-1. Thus, QR codes cannot be used to encode arbitrary binary data directly. 
Such data has to be converted into an appropriate text before that text could be encoded as a QR 
code. Compared to already established Base64, Base32, and Base16 encoding schemes that are 
described in [RFC4648], the Base45 scheme described in this document offers a more compact QR 
code encoding. 


One important difference from those others and Base45 is the key table and that the padding with 


wot 
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2. Conventions Used in This Document 


The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", 
"RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be 
interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all 
capitals, as shown here. 


3. Interpretation of Encoded Data 


Encoded data is to be interpreted as described in [RFC4648] with the exception that a different 
alphabet is selected. 


4. The Base45 Encoding 


QR codes have a limited ability to store binary data. In practice, binary data have to be encoded 
in characters according to one of the modes already defined in the standard for QR codes. The 
easiest mode to use in called Alphanumeric mode (see Section 7.3.4 and Table 2 of [ISO018004]. 
Unfortunately Alphanumeric mode uses 45 different characters which implies neither Base32 nor 
Base64 are very effective encodings. 


A 45-character subset of US-ASCII is used; the 45 characters usable in a QR code in Alphanumeric 
mode (see Section 7.3.4 and Table 2 of [ISO18004]). Base45 encodes 2 bytes in 3 characters, 
compared to Base64, which encodes 3 bytes in 4 characters. 


For encoding, two bytes [a, b] MUST be interpreted as a number n in base 256, i.e. as an unsigned 
integer over 16 bits so that the number n = (a * 256) + b. 


This number n is converted to base 45 [c, d, e] so that n = c + (d * 45) + (e * 45 * 45). Note the order 
ofc, d and e which are chosen so that the left-most [c] is the least significant. 


The values c, d, and e are then looked up in Table 1 to produce a three character string. The 
process is reversed when decoding. 


For encoding a single byte [a], it MUST be interpreted as a base 256 number, i.e. as an unsigned 
integer over 8 bits. That integer MUST be converted to base 45 [c d] so that a = c + (45 * d). The 
values c and d are then looked up in Table 1 to produce a two-character string. 


A byte string [a b c d... x y z] with arbitrary content and arbitrary length MUST be encoded as 
follows: From left to right pairs of bytes MUST be encoded as described above. If the number of 
bytes is even, then the encoded form is a string with a length that is evenly divisible by 3. If the 
number of bytes is odd, then the last (rightmost) byte MUST be encoded on two characters as 
described above. 


For decoding a Base45 encoded string the inverse operations are performed. 
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4.1. When to Use and Not Use Base45 


If binary data is to be stored in a QR code, the suggested mechanism is to use the Alphanumeric 
mode that uses 11 bits for 2 characters as defined in Section 7.3.4 of [I[SO18004]. The Extended 
Channel Interpretation (ECI) mode indicator for this encoding is 0010. 


On the other hand if the data is to be sent via some other transport, a transport encoding suitable 
for that transport should be used instead of Base45. For example, it is not recommended to first 
encode data in Base45 and then encode the resulting string in Base64 if the data is to be sent via 
email. Instead, the Base45 encoding should be removed, and the data itself should be encoded in 
Base64. 


4.2. The Alphabet Used in Base45 


The Alphanumeric mode is defined to use 45 characters as specified in this alphabet. 


Value Encoding Value Encoding Value Encoding Value Encoding 


00 0 W E 24 O 36 Space 
01 1 13 D 25. P 37 $ 
02 2 14 E 26 Q 38 % 
03 3 15 F 27 R 39, 7 
04 4 16 G 20S 40 + 
05 5 17 H 29 T 41 - 
06 6 18 =I 30 U 42 

07 7 19 J 31 V 43 | 
08 8 20 K 32 W 44 

09 9 21 L 33 X 

10 A 22 M 34 Y 

11 B 23 N 35 Z 


Table 1: The Base45 Alphabet 


4.3. Encoding Examples 


It should be noted that although the examples are all text, Base45 is an encoding for binary data 
where each octet can have any value 0-255. 
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Encoding example 1: 


The string "AB" is the byte sequence [[65 6611. If we look at all 16 bits, we get 65 * 256 + 66 = 16706. 
16706 equals 11 + (11 * 45) + (8 * 45 * 45), so the sequence in base 45 is [11 11 8]. Referring to 
Table 1, we get the encoded string "BB8". 


AB Initial string 
[[65 66]] Decimal value 
[16706] Value in base 16 
[11 11 8] Value in base 45 
BB8 Encoded string 


Table 2: Example 1 in Detail 
Encoding example 2: 


The string "Hello!!" as ASCII is the byte sequence [[72 101] [108 108] [111 33] [33]]. If we look at 
this 16 bits at a time, we get [18533 27756 28449 33]. Note the 33 for the last byte. When looking 
at the values in base 45, we get [[38 6 9] [36 31 13] [9 2 14] [33 0]], where the last byte is 
represented by two values. The resulting string "%69 VD92EX0" is created by looking up these 
values in Table 1. It should be noted it includes a space. 


Hello!! Initial string 
[[72 101] [108 108] [111 33] [33]] | Decimal value 
[18533 27756 28449 33] Value in base 16 
[[38 6 9] [36 31 13] [9 214] [330]] Value in base 45 


2069 VD92EX0 Encoded string 
Table 3: Example 2 in Detail 


Encoding example 3: 


The string "base-45" as ASCII is the byte sequence [[98 97] [115 101] [45 52] [53]]. If we look at 
this two bytes at a time, we get [25185 29541 11572 53]. Note the 53 for the last byte. When 
looking at the values in base 45, we get [[30 19 12] [21 26 14] [7 32 5] [8 1]] where the last byte is 
represented by two values. Referring to Table 1, we get the encoded string "UJCLQE7W581". 


base-45 Initial string 


[[98 97] [115 101] [45 52] [53]] Decimal value 
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[25185 29541 11572 53] Value in base 16 
[[30 19 12] [21 26 14] [7325] [8 1]] Value in base 45 
UJCLQE7W581 Encoded string 


Table 4: Example 3 in Detail 


4.4. Decoding Example 


Decoding example 1: 


The string "QED8WEX0" represents, when looked up in Table 1, the values [26 14 13 8 32 14 33 0]. 
We arrange the numbers in chunks of three, except for the last one which can be two numbers, 
and get [[26 14 13] [8 32 14] [33 0]]. In base 45, we get [26981 29798 33] where the bytes are [[105 
101] [116 102] [33]]. If we look at the ASCII values, we get the string "ietf!". 


QED8WEX0 Initial string 

[26 14 13 8 32 14 33 0] Looked up values 

[[26 14 13] [8 32 14] [33 0]] Groups of three 
[26981 29798 33] Interpreted as base 45 
[[105 101] [116 102] [33]] Values in base 8 


ietf! Decoded string 


Table 5: Example 4 in Detail 


5. IANA Considerations 


This document has no IANA actions. 


6. Security Considerations 


When implementing encoding and decoding it is important to be very careful so that buffer 
overflow or similar issues do not occur. This of course includes the calculations in base 45 and 
lookup in the table of characters (Table 1). Adecoder must also be robust regarding input, 
including proper handling of any octet value 0-255, including the NUL character (ASCII 0). 


It should be noted that Base64 and some other encodings pad the string so that the encoding 
starts with an aligned number of characters while Base45 specifically avoids padding. Because of 
this, special care has to be taken when an odd number of octets is to be encoded. Similarly, care 
must be taken if the number of characters to decode are not evenly divisible by 3. 
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Base encodings use a specific, reduced alphabet to encode binary data. Non-alphabet characters 
could exist within base-encoded data, caused by data corruption or by design. Non-alphabet 
characters may be exploited as a "covert channel", where non-protocol data can be sent for 
nefarious purposes. Non-alphabet characters might also be sent in order to exploit 
implementation errors leading to, for example, buffer overflow attacks. 


Implementations MUST reject any input that is not a valid encoding. For example, it MUST reject 
the input (encoded data) if it contains characters outside the base alphabet (in Table 1) when 
interpreting base-encoded data. 


Even though a Base45-encoded string contains only characters from the alphabet in Table 1, 
cases like the following have to be considered: The string "FGW" represents 65535 (FFFF in base 
16), which is a valid encoding of 16 bits. A slightly different encoded string of the same length, 
"GGW", would represent 65536 (10000 in base 16), which is represented by more than 16 bits. 
Implementations MUST also reject the encoded data if it contains a triplet of characters that, 
when decoded, results in an unsigned integer that is greater than 65535 (FFFF in base 16). 


It should be noted that the resulting string after encoding to Base45 might include non-URL-safe 
characters so if the URL including the Base45 encoded data has to be URL-safe, one has to use 
percent-encoding. 
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